Cohen - 2021 - Linear Algebra theory, intuition, code¶
# import libraries
%load_ext autoreload
%autoreload 2
import numpy as np
import scipy as sp
from IPython.display import Image
Keys:
AI safety application
5.5 Matrix Zoology¶
Square or rectangular¶
skew-symmetric¶
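Why would we want a skew-symmetric matrix?
- a skew-symmetric matrix equals the negative of its own transpose ($A^T = -A$), so its diagonal is all zeros
- they don't seem to have much of a direct application (for AI safety)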
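To make the definition concrete, here is a minimal sketch that splits a square matrix into its symmetric and skew-symmetric parts. The matrix `m` is a made-up example, not something from the book.

```python
import numpy as np

# Hypothetical example matrix; any square matrix works here.
m = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 10.]])

skew = (m - m.T) / 2   # skew-symmetric part: skew.T == -skew
sym = (m + m.T) / 2    # symmetric part: sym.T == sym

print(np.array_equal(skew.T, -skew))   # transpose equals the negative
print(np.allclose(np.diag(skew), 0))   # diagonal is all zeros
print(np.allclose(sym + skew, m))      # the two parts rebuild m
```

Every square matrix can be split this way, which is one reason the skew-symmetric type shows up in the zoology even when it is not used directly.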
Identity¶
display(np.array(([1,0,0],[0,1,0],[0,0,1]))) # manually
display(np.identity(3)) # using np.identity
display(np.array_equal(np.array(([[1,0,0],[0,1,0],[0,0,1]])), np.identity(3))) # sanity check
array([[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])
array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
True
What application does it have to ai safety?
The identity matrix has 1s on the diagonal and 0s everywhere else. If the diagonal holds any other values, it is a general diagonal matrix rather than the identity.
initialization (of starting state)
- what would happen if we were to initialize all weights at zero?
- the agent would struggle to learn
- why is that?
- gradient flow.
- dead neuron problem.
- pre-req:
* [ ] understand backprop
- initialization schemes to compare:
* all zero:
* identity:
* modified:
* Xavier / Glorot (two names for the same scheme):
* He:
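The zero-initialization question above can be checked on a toy example. This is only a sketch under assumptions: the two-layer tanh network, the squared-error loss, the input `x`, and the `target` are all made up for illustration, and the He-style scaling (standard deviation `sqrt(2 / fan_in)`) is used only as one example of a non-zero starting point.

```python
import numpy as np

def gradients(w1, w2, x, target):
    """Gradients of 0.5 * ||w2 @ tanh(w1 @ x) - target||^2 w.r.t. w1 and w2."""
    h = np.tanh(w1 @ x)                      # hidden activations
    y = w2 @ h                               # network output
    dy = y - target                          # dLoss/dy
    grad_w2 = np.outer(dy, h)                # needs non-zero h
    dh = w2.T @ dy                           # backprop through the output layer
    grad_w1 = np.outer(dh * (1 - h**2), x)   # tanh'(z) = 1 - tanh(z)^2
    return grad_w1, grad_w2

x = np.array([1.0, -2.0, 0.5])       # hypothetical input
target = np.array([1.0, 0.0])        # hypothetical target

# all-zero initialization: no signal reaches either weight matrix
g1_zero, g2_zero = gradients(np.zeros((4, 3)), np.zeros((2, 4)), x, target)

# He-style initialization: normal with std sqrt(2 / fan_in)
rng = np.random.default_rng(0)
w1 = rng.normal(0.0, np.sqrt(2 / 3), size=(4, 3))
w2 = rng.normal(0.0, np.sqrt(2 / 4), size=(2, 4))
g1_he, g2_he = gradients(w1, w2, x, target)

print(np.all(g1_zero == 0), np.all(g2_zero == 0))   # both gradients vanish
print(np.any(g1_he != 0), np.any(g2_he != 0))       # gradients are non-zero
```

In this toy network the hidden activations are zero, so the output-layer gradient is zero, and the output weights are zero, so nothing flows back to the first layer either. That is the "gradient flow" problem in its most extreme form.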
robotics control
- "do nothing in terms of movement or orientational change"
zeros¶
display(Image('data/17072024_194816.png'))
a = np.array(([[0,1,2,1],[3,4,5,4],[6,7,8,7],[9,10,11,10]])) # A
i = np.eye(4) # identity matrix
o = np.ones((4, 4)) # (4, 4) ones
z = np.zeros((4, 4)) # (4, 4) zeros
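# sanity checks
display(np.array_equal(np.dot(a,i), a)) # A times the identity leaves A unchanged
display(np.array_equal(a+i, a)) # adding the identity changes the diagonal of A
display(np.array_equal(a+z, a)) # adding the zeros matrix leaves A unchanged
display(np.array_equal(np.dot(a,z), a)) # A times the zeros matrix gives all zeros, not A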
True
False
True
False
display(Image('data/17072024_195232.png'))
a = np.array(([10,5,1],[15,6,2],[20,7,3],[25,8,4]))
# full-rank?
display(np.linalg.matrix_rank(a))
# full column-rank?
# invertible?
# orthogonal eigenvectors?
# positive (semi) definite?
# eigenvalues?
np.dot(a.T,a)
# 1)
2
array([[1350, 480, 200],
[ 480, 174, 70],
[ 200, 70, 30]])
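The open questions in the cell above (full column rank, invertible, eigenvalues, definiteness) can be probed numerically. A minimal sketch, assuming a round-off tolerance of `1e-8`, which is an arbitrary choice made here:

```python
import numpy as np

a = np.array([[10, 5, 1], [15, 6, 2], [20, 7, 3], [25, 8, 4]])
ata = a.T @ a                                # (3, 3) and symmetric

rank_a = np.linalg.matrix_rank(a)            # rank of A
full_column_rank = rank_a == a.shape[1]      # would need rank 3

# eigh is intended for symmetric matrices: real eigenvalues in ascending
# order and orthonormal eigenvectors
eigvals, eigvecs = np.linalg.eigh(ata)
tol = 1e-8                                   # assumed round-off tolerance

invertible = np.min(np.abs(eigvals)) > tol   # a zero eigenvalue means singular
orthogonal_vecs = np.allclose(eigvecs.T @ eigvecs, np.eye(3))
positive_semidefinite = np.all(eigvals > -tol)

print(rank_a, full_column_rank, invertible)
print(orthogonal_vecs, positive_semidefinite)
print(eigvals)
```

Since $A^TA$ always has the same rank as $A$, a rank-2 matrix with three columns gives a singular $A^TA$: one eigenvalue is zero up to round-off, so the matrix is positive semidefinite but not positive definite.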
A = np.array([
[1, 2, 3, 4], # Sensor 1 data
[2, 4, 6, 8], # Sensor 2 data (redundant)
[3, 6, 9, 12], # Sensor 3 data (redundant)
[4, 5, 6, 7] # Sensor 4 data (unique)
])
np.linalg.matrix_rank(A)
2
Diagonal¶
why is it important?
- f
how does it differ from the identity? (the ones are replaced with other values, but for what reason?)
- f
diagonal vs anti/counter diagonal?
- f
Anti-Diagonal¶
d = np.diag([1,2,3,4,5,6]) #diagonal
ad = np.fliplr(d) # anti-diagonal: the diagonal matrix flipped left to right
# np.dot(d, d.T)
# display(d, d.T)
# "The transpose of an anti-diagonal matrix is also an anti-diagonal matrix."
# note: the analogous statement holds for diagonal matrices too, since d.T equals d.
sev = np.array(([[7,0],[0,7]]))
display(
np.array(([[7]])) * np.eye(2)
)
display(np.hstack((np.diag([2,6], k=0) ,np.zeros((2,2)))))
array([[7., 0.],
[0., 7.]])
array([[2., 0., 0., 0.],
[0., 6., 0., 0.]])
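A few properties that separate diagonal from anti-diagonal matrices can be checked directly. This is a sketch; the vector `v` is a made-up example.

```python
import numpy as np

vals = np.array([1, 2, 3, 4, 5, 6])
d = np.diag(vals)        # diagonal
ad = np.fliplr(d)        # anti-diagonal
v = np.arange(1, 7)      # hypothetical vector

# diagonal @ vector is element-wise scaling: entry k is multiplied by vals[k]
print(np.array_equal(d @ v, vals * v))

# a diagonal matrix is its own transpose
print(np.array_equal(d.T, d))

# the transpose of the anti-diagonal matrix is still anti-diagonal,
# but its entries appear in reverse order
print(np.array_equal(ad.T, np.fliplr(np.diag(vals[::-1]))))

# a scalar times the identity is a diagonal matrix with that scalar repeated
print(np.array_equal(7 * np.eye(2), np.diag([7, 7])))
```

All four checks print `True`. The first is the usual reason diagonal matrices matter: multiplying by one costs one multiplication per entry instead of a full matrix product.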
Augmented¶
- used to assist in solving linear equations (using Gaussian elimination)
- we already established that, in terms of computing, this is very dangerous and should only be used in temporary circumstances.
- after augmenting we are left with no metadata, just a new matrix: nothing records which columns were coefficients and which were constants, so we should keep the two separate at all costs
$3x + 3y - z = 4$
$5x - 4y + 6z = 5$
$x + 2y - z = 2$
prob = np.array(([[3,3,-1],[5,-4,6],[1,2,-1]]))
sol = np.array(([[4],[5],[2]]))
ans = np.linalg.solve(prob, sol)
print(f" x = {ans[0][0]} \n y = {ans[1][0]} \n z = {ans[2][0]}")
x = -0.20000000000000107
y = 2.400000000000002
z = 2.6000000000000023
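For comparison with `np.linalg.solve`, here is a sketch of the augmented-matrix route: build $[A \mid b]$ with `np.hstack`, run Gauss-Jordan elimination with partial pivoting, and split the result apart again straight away. It assumes the coefficient matrix is non-singular; the elimination loop is a textbook version written here, not NumPy's internal method.

```python
import numpy as np

prob = np.array([[3, 3, -1], [5, -4, 6], [1, 2, -1]], dtype=float)
sol = np.array([[4], [5], [2]], dtype=float)

aug = np.hstack((prob, sol))   # temporary augmented matrix [A | b], shape (3, 4)
n = prob.shape[0]

# Gauss-Jordan elimination with partial pivoting (assumes A is non-singular)
for col in range(n):
    pivot = col + np.argmax(np.abs(aug[col:, col]))   # largest entry in this column
    aug[[col, pivot]] = aug[[pivot, col]]             # swap it into the pivot row
    aug[col] /= aug[col, col]                         # scale the pivot to 1
    for row in range(n):
        if row != col:
            aug[row] -= aug[row, col] * aug[col]      # clear the rest of the column

# split straight back into coefficients and constants
coeffs, ans = aug[:, :n], aug[:, n:]
print(ans.ravel())
```

After elimination `coeffs` is the identity (up to round-off) and `ans` holds x, y and z. `prob` and `sol` are untouched because `np.hstack` copies them, which keeps the original data separate from the temporary augmented matrix.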
Triangular¶
a = np.array(([[1,2,3,1],[4,5,6,2],[7,8,9,3],[8,9,10,4]]))
l = np.tril((a))
u = np.triu((a))
display(a, l, u)
array([[ 1, 2, 3, 1],
[ 4, 5, 6, 2],
[ 7, 8, 9, 3],
[ 8, 9, 10, 4]])
array([[ 1, 0, 0, 0],
[ 4, 5, 0, 0],
[ 7, 8, 9, 0],
[ 8, 9, 10, 4]])
array([[1, 2, 3, 1],
[0, 5, 6, 2],
[0, 0, 9, 3],
[0, 0, 0, 4]])
Dense and sparse¶
display(Image('data/18072024_115015.png'))
a = np.zeros((10, 10))
a[0,2] = 4
a[3,5] = 8
print(sp.sparse.csr_matrix(a))
a
<Compressed Sparse Row sparse matrix of dtype 'float64'
	with 2 stored elements and shape (10, 10)>
  Coords	Values
  (0, 2)	4.0
  (3, 5)	8.0
array([[0., 0., 4., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 8., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
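To see what the sparse format actually stores, this sketch inspects the three arrays behind a CSR matrix and compares their memory use with the dense array. The byte counts depend on the index dtype SciPy picks, so no specific numbers are claimed here.

```python
import numpy as np
from scipy import sparse

a = np.zeros((10, 10))
a[0, 2] = 4
a[3, 5] = 8
s = sparse.csr_matrix(a)

print(s.nnz)       # number of stored elements
print(s.data)      # the stored values
print(s.indices)   # column index of each stored value
print(s.indptr)    # row i owns data[indptr[i]:indptr[i + 1]]

dense_bytes = a.nbytes
sparse_bytes = s.data.nbytes + s.indices.nbytes + s.indptr.nbytes
print(dense_bytes, sparse_bytes)
```

The dense array stores all 100 entries, while CSR stores only the 2 non-zero values plus their bookkeeping indices.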
Orthogonal¶
x = 10
A = np.array(([
[np.cos(x),np.sin(x)],
[-np.sin(x), np.cos(x)]
]))
A_round = np.round(A, decimals=3)
display(A)
# All of its columns are pairwise orthogonal: the dot product between any two different columns is 0 (up to floating-point round-off), and each column has unit length.
display(np.dot(A_round[:, 0], A_round[:, 1]))
display(np.dot(A_round[:, 1], A_round[:, 0]))
display(np.round(np.dot(A.T, A), decimals=1))
array([[-0.83907153, -0.54402111],
[ 0.54402111, -0.83907153]])
0.0
0.0
array([[ 1., -0.],
[-0., 1.]])
display(
np.dot(A[:, 0],A[:, 0]),
np.dot(A[:, 1],A[:, 1])
)
display(A[0, 0] * A[0, 0] + A[1, 0] * A[1, 0]) # dot product manual
1.0
1.0
1.0
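Orthogonal columns with unit length have some practical consequences, sketched below: the inverse is just the transpose, and multiplying by the matrix does not change a vector's length. The test vector `v` is a made-up example.

```python
import numpy as np

x = 10  # angle in radians, same value as above
A = np.array([[np.cos(x), np.sin(x)],
              [-np.sin(x), np.cos(x)]])
v = np.array([3.0, 4.0])   # hypothetical test vector with length 5

print(np.allclose(A.T @ A, np.eye(2)))                        # A^T A = I
print(np.allclose(np.linalg.inv(A), A.T))                     # inverse equals transpose
print(np.isclose(np.linalg.norm(A @ v), np.linalg.norm(v)))   # length is preserved
print(np.isclose(abs(np.linalg.det(A)), 1.0))                 # determinant is +1 or -1
```

All four checks print `True`; `np.allclose` and `np.isclose` are used instead of exact equality because the sines and cosines carry floating-point round-off.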
Toeplitz¶
while researching these, I encountered a lot of bold claims such as:
- efficient storage
- fast matrix-vector multiplication
it should be noted that this is not a transformation in the sense of converting a file from .jpeg to .png.
rather, the benefits above apply when the data already has (naturally) a structure similar to a Toeplitz matrix, i.e. constant values along each diagonal.
import plotly.graph_objects as go
np.random.seed(0)
matrix = np.arange(0,14)
matrix_toe = sp.linalg.toeplitz(matrix)
display(matrix_toe)
fig = go.Figure(data=go.Heatmap(z=matrix_toe))
fig.update_layout(title="Toeplitz. (14, 14)")
fig.show()
# display(matrix_toe) # np.array(([[0,1,2,3],[1,0,1,2],[2,1,0,1],[3,2,1,0]]))
# matrix_toe.shape # shape: (4, 4)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
[ 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
[ 2, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[ 3, 2, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[ 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[ 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 6, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6, 7],
[ 7, 6, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6],
[ 8, 7, 6, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5],
[ 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4],
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 1, 2, 3],
[11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 1, 2],
[12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 1],
[13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]])
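The two claims (efficient storage, fast matrix-vector multiplication) can be checked on the same 14-element example. A sketch, assuming SciPy 1.6 or newer, where `scipy.linalg.matmul_toeplitz` is available; the vector `x` is made up.

```python
import numpy as np
from scipy.linalg import toeplitz, matmul_toeplitz

c = np.arange(0, 14, dtype=float)   # first column (and, here, first row)
x = np.arange(14, dtype=float)      # hypothetical vector to multiply

dense = toeplitz(c)                 # full (14, 14) matrix, 196 stored entries

# storage: entry (i, j) only depends on |i - j| for this symmetric case,
# so the 14 values in c are enough to rebuild the whole matrix
i, j = np.indices(dense.shape)
print(np.array_equal(dense, c[np.abs(i - j)]))

# fast product: FFT-based, works from c without forming the dense matrix
y_dense = dense @ x
y_fast = matmul_toeplitz(c, x)
print(np.allclose(y_dense, y_fast))
```

For a matrix this small the dense product is just as quick; the FFT-based route is expected to pay off only as the size grows.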
Hankel¶
np.set_printoptions(linewidth=300)
matrix = np.arange(0, 14)
matrix_hank = sp.linalg.hankel(matrix)
fig = go.Figure(data=go.Heatmap(z=matrix_hank))
fig.update_layout(title="Hankel. (14, 14)")
fig.show()
display(Image('data/18072024_143555.png'))
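Where a Toeplitz matrix is constant along diagonals, a Hankel matrix is constant along anti-diagonals. The sketch below checks that on the same input. It relies on `scipy.linalg.hankel` filling the part below the main anti-diagonal with zeros when no last row is given.

```python
import numpy as np
from scipy.linalg import hankel, toeplitz

c = np.arange(0, 14)
h = hankel(c)                        # (14, 14), first column is c

# entry (i, j) depends only on i + j; positions past the end of c are zero
i, j = np.indices(h.shape)
expected = np.where(i + j < 14, i + j, 0)
print(np.array_equal(h, expected))

# a square Hankel matrix is symmetric
print(np.array_equal(h, h.T))

# flipping a Hankel matrix left to right gives a Toeplitz matrix
flipped = np.fliplr(h)
print(np.array_equal(flipped, toeplitz(flipped[:, 0], flipped[0, :])))
```

All three checks print `True`, and the last one is the link between the two heatmaps: one is the mirror image of the other's structure.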